A New Framework for Mandarin LVCSR Based on One-pass Decoder
Abstract
This paper describes a new framework for Mandarin large-vocabulary continuous speech recognition (LVCSR) based on a one-pass decoder and decision-tree-based class-triphone acoustic modeling. Compared with a multi-pass decoder, a one-pass decoder should be better informed and more efficient, because all knowledge sources are applied at the same time, provided the decoder is well organized and optimized. We describe in detail the organization of our one-pass decoder and how it handles the search-space explosion caused by the very large number of triphones and by cross-word extension, which must deal with an unknown right context, including the tone context. Experimental results on the male test database from the China National Hi-Tech Project 863 show that, with the non-tonal class-triphone model, the character error rate (CER) was reduced to 13.04% for the open language model (LM) and 2.8% for the closed LM. With the tonal class-triphone model, CER reaches 10.31%, a 21% relative reduction in character error compared with the non-tonal class-triphone model.
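The relative reduction quoted in the abstract can be checked with a short calculation. The sketch below assumes that the 10.31% tonal result is compared against the 13.04% non-tonal open-LM result, since that pairing reproduces the reported 21%; the function and variable names are illustrative and do not come from the paper.

```python
def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Return the error reduction as a percentage of the baseline error rate."""
    if baseline_cer <= 0:
        raise ValueError("baseline_cer must be positive")
    return (baseline_cer - new_cer) / baseline_cer * 100.0


# CER values (in percent) taken from the abstract.
# Assumption: both figures refer to the open-LM condition.
non_tonal_cer = 13.04
tonal_cer = 10.31

reduction = relative_reduction(non_tonal_cer, tonal_cer)
print(f"{reduction:.1f}% relative reduction")
```

Under that assumption the result is about 20.9%, which rounds to the 21% stated in the abstract.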
Similar resources
Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR
NLPR has long worked on Mandarin speech recognition. This paper reports our recent progress in this field, with several significant novel characteristics: 1) Very large speech databases are used to learn a more robust acoustic model; 2) The acoustic model has evolved from non-tonal class-triphone to tonal class-triphone based on a tone-embedded decision tree, namely unified tone & triphone mod...
A multi-pass error detection and correction framework for Mandarin LVCSR
We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models for recognition, while maintaining a balance between complexity and efficiency. The framework is composed of three passes: initial recognition, error detection and error correction. This paper presents and evaluates...
Decoding-time prediction of non-verbalized punctuation
This paper presents novel methods that integrate lexical prediction of non-verbalized punctuation with Viterbi decoding for Large Vocabulary Conversational Speech Recognition (LVCSR) in a single pass. We describe two different approaches: one based on a modified finite state machine representation of language models and one based on an extension of an LVCSR decoder. We discuss advantages over t...
Development of CSLU LVCSR: the 1997 DARPA Hub4 Evaluation System
This paper presents the CSLU Broadcast News transcription system used in the DARPA 1997 evaluation. The system was built using the software developed for the CSLU LVCSR project started in January 1997. This 25K-word vocabulary system used continuous HMMs for acoustic modeling and the standard backoff trigram as the language model. The search used a single pass decoder with MLLR based adaptation ...
Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system
This paper aims at investigating the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is done in the framework of the GALE project for recognition of Mandarin Broadcast data. We describe the improvements obtained using the hierarchical processing and the addition of features like pitch and short-term critical band energy. Results are consistent with ...
Publication date: 2000